Example-based Machine Translation Based on Syntactic Transfer with Statistical Models

Authors

  • Kenji Imamura
  • Hideo Okuma
  • Taro Watanabe
  • Eiichiro Sumita
Abstract

This paper presents example-based machine translation (MT) based on syntactic transfer, which selects the best translation by using models of statistical machine translation. Example-based MT sometimes generates invalid translations because it selects examples similar to the input sentence based only on source language similarity. The method proposed in this paper selects the best translation by using a language model and a translation model in the same manner as statistical MT, and it can improve MT quality over that of ‘pure’ example-based MT. A feature of this method is that the statistical models are applied after word re-ordering is achieved by syntactic transfer. This implies that MT quality is maintained even when we only apply a lexicon model as the translation model. In addition, translation speed is improved by bottom-up generation, which utilizes the tree structure that is output from the syntactic transfer.
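The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the syntactic transfer has already produced candidate translations in target word order, and it scores each candidate with a bigram language model plus an IBM Model 1 style lexicon model. The function names (`lm_logprob`, `lexicon_logprob`, `select_best`), the `FLOOR` back-off value, the romanized source tokens, and every probability in the tables are hypothetical.

```python
import math

# All probabilities below are toy values invented for illustration;
# they do not come from the paper.
FLOOR = 1e-6  # back-off probability for unseen events (an assumption)

# Bigram language model P(w_i | w_{i-1}) over target-language words.
BIGRAM = {
    ("<s>", "read"): 0.3,
    ("read", "a"): 0.4,
    ("a", "book"): 0.3,
    ("a", "magazine"): 0.1,
    ("a", "letter"): 0.1,
    ("book", "</s>"): 0.5,
    ("magazine", "</s>"): 0.5,
    ("letter", "</s>"): 0.5,
}

# Lexicon model t(source_word | target_word); "NULL" is the empty target word.
LEXICON = {
    ("hon", "book"): 0.8,
    ("hon", "magazine"): 0.05,
    ("yomu", "read"): 0.7,
    ("o", "NULL"): 0.5,
}


def lm_logprob(target):
    """Log-probability of a target word sequence under the bigram model."""
    words = ["<s>"] + target + ["</s>"]
    return sum(math.log(BIGRAM.get(pair, FLOOR)) for pair in zip(words, words[1:]))


def lexicon_logprob(source, target):
    """Model 1 style score: each source word is explained by any target word or NULL."""
    slots = ["NULL"] + target
    total = 0.0
    for s in source:
        p = sum(LEXICON.get((s, t), FLOOR) for t in slots) / len(slots)
        total += math.log(p)
    return total


def select_best(source, candidates):
    """Return the candidate that maximizes language model + lexicon model score."""
    return max(candidates, key=lambda c: lm_logprob(c) + lexicon_logprob(source, c))


source = ["hon", "o", "yomu"]
candidates = [
    ["read", "a", "book"],
    ["read", "a", "magazine"],
    ["read", "a", "letter"],
]
print(" ".join(select_best(source, candidates)))
```

Because the candidates already share the word order fixed by the transfer step, this sketch needs no distortion or alignment-position model: the lexicon score separates the lexical choices, and the language model checks fluency.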

Similar resources

A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in the English language are hyphenated, with a hyphen separating the different parts. The Persian language has multi-part words as well. Based on Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead of the half-space character. This common incorrect use of the space character leads to some s...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Lexical Ambiguity Resolution for Turkish in Direct Transfer Machine Translation Models

This paper presents a statistical lexical ambiguity resolution method in direct transfer machine translation models in which the target language is Turkish. Since direct transfer MT models do not have full syntactic information, most of the lexical ambiguity resolution methods are not very helpful. Our disambiguation model is based on statistical language models. We have investigated the perfor...

Syntactic Transformations Modelling for Hybrid Machine Translation

The paper deals with the problems of modelling cross-language syntactic transformations for the design and development of transfer-based machine translation systems. The solutions are proposed on the basis of a hybrid grammar comprising linguistic rules and statistical information about the language structures preferred in particular languages.

Statistical Machine Translation Using Coercive Two-Level Syntactic Transduction

We define, implement and evaluate a novel model for statistical machine translation, which is based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. It is able to model long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, su...

Publication date: 2004